Recognition model for French named entities based on deep neural network
YAN Hong, CHEN Xingshu, WANG Wenxian, WANG Haizhou, YIN Mingyong
Journal of Computer Applications    2019, 39 (5): 1288-1292.   DOI: 10.11772/j.issn.1001-9081.2018102155
In existing French Named Entity Recognition (NER) research, machine learning models mostly use the character morphological features of words, while multilingual generic named entity models use the semantic features represented by word embeddings; neither takes semantic, character morphological and grammatical features into account comprehensively. To address this shortcoming, a deep neural network based model named CGC-fr was designed to recognize French named entities. Firstly, word embeddings, character embeddings and grammar feature vectors were extracted from the text. Then, a character feature was extracted from the character embedding sequence of each word by a Convolutional Neural Network (CNN). Finally, a Bi-directional Gated Recurrent Unit network (BiGRU) and a Conditional Random Field (CRF) were used to label named entities in French text according to the word embeddings, character features and grammar feature vectors. In the experiments, the F1 value of the CGC-fr model reaches 82.16% on the test set, which is 5.67, 1.79 and 1.06 percentage points higher than that of the NERC-fr, LSTM (Long Short-Term Memory network)-CRF and Char attention models respectively. The experimental results show that the CGC-fr model, which combines the three kinds of features, outperforms the other models.
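A minimal PyTorch sketch of an encoder along the lines described in the abstract; all vocabulary sizes, dimensions and tag counts are illustrative assumptions, and the CRF layer is reduced to a linear emission layer for brevity rather than the paper's full labeling model.

```python
import torch
import torch.nn as nn

class CGCStyleEncoder(nn.Module):
    """Sketch of a CGC-fr-style encoder: character CNN + BiGRU over
    concatenated word/character/grammar features, emitting per-tag scores."""
    def __init__(self, word_vocab=20000, char_vocab=100, n_tags=9,
                 word_dim=100, char_dim=30, char_filters=30, gram_dim=20):
        super().__init__()
        self.word_emb = nn.Embedding(word_vocab, word_dim)
        self.char_emb = nn.Embedding(char_vocab, char_dim)
        self.char_cnn = nn.Conv1d(char_dim, char_filters, kernel_size=3, padding=1)
        self.bigru = nn.GRU(word_dim + char_filters + gram_dim, 100,
                            batch_first=True, bidirectional=True)
        self.emissions = nn.Linear(200, n_tags)   # a CRF would score these emissions

    def forward(self, words, chars, grammar):
        # words: (B, T), chars: (B, T, L), grammar: (B, T, gram_dim)
        B, T, L = chars.shape
        w = self.word_emb(words)                               # (B, T, word_dim)
        c = self.char_emb(chars).view(B * T, L, -1).transpose(1, 2)
        c = torch.relu(self.char_cnn(c)).max(dim=2).values     # char feature per word
        c = c.view(B, T, -1)
        h, _ = self.bigru(torch.cat([w, c, grammar], dim=-1))  # (B, T, 200)
        return self.emissions(h)                               # (B, T, n_tags)
```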
Short-term bus load forecasting based on hierarchical clustering method and extreme learning machine
YAN Hongwen, SHENG Chenggong
Journal of Computer Applications    2018, 38 (8): 2437-2441.   DOI: 10.11772/j.issn.1001-9081.2018010017
Traditionally, the days immediately before the forecast day are selected as historical similar days for training a bus load forecasting model, without considering weather conditions, weekday and holiday information; as a result, the daily characteristics of the selected historical days may differ from those of the forecast day. To solve this problem, a new bus load forecasting method based on Hierarchical Clustering (HC) and Extreme Learning Machine (ELM) was proposed. Firstly, the HC method was used to cluster the historical daily bus loads. Secondly, a decision tree was constructed based on the clustering results. Thirdly, according to the properties of the forecast day, such as temperature, humidity, weekday and holiday information, the decision tree was used to retrieve the historical daily bus loads for training an extreme learning machine forecasting model. Finally, the trained model was used to predict the bus load. When forecasting the loads of two different buses, the proposed algorithm decreases the Mean Absolute Percentage Error (MAPE) by 1.4 percentage points and 0.8 percentage points respectively compared with a traditional single ELM. The experimental results show that the proposed method has higher accuracy and better stability for short-term bus load forecasting.
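A rough sketch of this pipeline under stated assumptions: scikit-learn's AgglomerativeClustering stands in for the hierarchical clustering step, matching the forecast day to a cluster by its attributes replaces the decision tree, and a minimal NumPy extreme learning machine is trained on the selected history. All sizes, features and data are illustrative.

```python
import numpy as np
from sklearn.cluster import AgglomerativeClustering

rng = np.random.default_rng(0)
daily_load = rng.random((200, 24))       # historical 24-hour bus-load curves (toy data)
day_features = rng.random((200, 4))      # temperature, humidity, weekday, holiday flag

# 1) hierarchical clustering of historical daily load curves
labels = AgglomerativeClustering(n_clusters=4).fit_predict(daily_load)

# 2) pick historical days whose cluster matches the forecast day's attributes
#    (a simplified stand-in for the decision tree built on the clustering results)
forecast_day = np.array([0.6, 0.4, 1.0, 0.0])
centroids = np.vstack([day_features[labels == k].mean(axis=0) for k in range(4)])
target = np.argmin(np.linalg.norm(centroids - forecast_day, axis=1))
X, y = day_features[labels == target], daily_load[labels == target]

# 3) extreme learning machine: random hidden layer + least-squares output weights
W = rng.normal(size=(X.shape[1], 64))
b = rng.normal(size=64)
H = np.tanh(X @ W + b)
beta = np.linalg.pinv(H) @ y                 # output weights
forecast = np.tanh(forecast_day @ W + b) @ beta   # predicted 24-hour load curve
```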
Outlier detection in time series data based on heteroscedastic Gaussian processes
YAN Hong, YANG Bo, YANG Hongyu
Journal of Computer Applications    2018, 38 (5): 1346-1352.   DOI: 10.11772/j.issn.1001-9081.2017102511
Time series data generally contain inevitable disturbances, such as inherent uncertainties and external interferences. To detect outliers in time series data with time-varying disturbances, an approach based on a Gaussian process prediction model was proposed. The monitoring data was decomposed into two components: the standard value and the deviation term. In addition to the Gaussian process model for the ideal standard value without any deviation, Gaussian processes were also employed to model the heteroscedastic deviations. The posterior distribution of the predicted data, which becomes analytically intractable after introducing the deviation term, was approximated by variational inference. A tolerance interval derived from the posterior distribution was then used for outlier detection. Verification experiments were conducted on the public Yahoo network traffic time series datasets. The calculated tolerance interval coincided with the actual range of reasonable deviation of labeled normal data at various time points. In the comparison experiments, the proposed model outperformed the AutoRegressive Integrated Moving Average (ARIMA) model, one-class support vector machine and Density-Based Spatial Clustering of Applications with Noise (DBSCAN) in terms of F1-score. The experimental results show that the proposed model can effectively describe the distribution of normal data at various time points, achieve a tradeoff between false alarm rate and recall, and avoid the performance problems caused by improper parameter settings.
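A simplified, homoscedastic stand-in for the tolerance-interval idea above, using scikit-learn's GaussianProcessRegressor rather than the paper's variational heteroscedastic model; the kernel choice, interval width k and synthetic data are assumptions.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

# synthetic time series with one injected anomaly (illustrative only)
t = np.linspace(0, 10, 200)[:, None]
y = np.sin(t).ravel() + 0.1 * np.random.default_rng(1).normal(size=200)
y[120] += 2.0

# homoscedastic GP stand-in; the paper additionally models the noise with a second GP
gp = GaussianProcessRegressor(kernel=RBF() + WhiteKernel(), normalize_y=True)
gp.fit(t, y)
mean, std = gp.predict(t, return_std=True)

# tolerance interval: flag points beyond k predictive standard deviations
k = 3.0
outliers = np.where(np.abs(y - mean) > k * std)[0]
print(outliers)   # indices flagged as outliers
```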
Video super resolution method based on structure tensor
YAN Honghai, PU Fangling, XU Xin
Journal of Computer Applications    2016, 36 (7): 1944-1948.   DOI: 10.11772/j.issn.1001-9081.2016.07.1944
The regularization parameter of a traditional regularized Super Resolution (SR) reconstruction model is difficult to choose: a higher value results in blurred reconstruction and the fading of edges and details, while a lower value weakens the denoising ability. A super resolution reconstruction method with dual regularization parameters based on the structure tensor was proposed. Firstly, smooth regions and edges were detected using the local structure tensor. Secondly, the Total Variation (TV) prior was weighted with difference curvature information. Finally, two different regularization parameters, one for smooth regions and one for edges, were used to reconstruct the super resolution image. The experimental data show that the proposed algorithm improves the Peak Signal-to-Noise Ratio (PSNR) by 0.033-0.11 dB and obtains better reconstruction results. The proposed algorithm can effectively improve the reconstruction of Low Resolution (LR) video frames, and can be applied to LR video enhancement, license plate recognition, enhancement of targets of interest in video surveillance, etc.
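An illustrative NumPy/SciPy sketch of the region-detection step only: the local structure tensor is built from smoothed image gradients, its eigenvalue gap separates edges from smooth regions, and each pixel is assigned one of two regularization weights. The thresholds and weight values are assumptions, not the paper's settings.

```python
import numpy as np
from scipy import ndimage

def regularization_map(image, sigma=1.5, edge_thresh=1e-3,
                       lambda_smooth=0.2, lambda_edge=0.02):
    """Per-pixel regularization weights derived from the local structure tensor."""
    gx = ndimage.sobel(image, axis=1, mode='reflect')
    gy = ndimage.sobel(image, axis=0, mode='reflect')
    # structure tensor components, locally averaged with a Gaussian window
    Jxx = ndimage.gaussian_filter(gx * gx, sigma)
    Jxy = ndimage.gaussian_filter(gx * gy, sigma)
    Jyy = ndimage.gaussian_filter(gy * gy, sigma)
    # eigenvalues of the 2x2 tensor at every pixel
    trace = Jxx + Jyy
    gap = np.sqrt((Jxx - Jyy) ** 2 + 4 * Jxy ** 2)
    lam1, lam2 = (trace + gap) / 2, (trace - gap) / 2
    edges = (lam1 - lam2) > edge_thresh          # strong anisotropy -> edge pixel
    return np.where(edges, lambda_edge, lambda_smooth)

# usage: weights = regularization_map(lr_frame); use them to weight the TV prior
```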
Evaluation model based on dual-threshold constrained tolerance dominance relation
YU Shunkun, YAN Hongxu
Journal of Computer Applications    2016, 36 (11): 3131-3135.   DOI: 10.11772/j.issn.1001-9081.2016.11.3131
In ordered information systems, the classical dominance relation rough set is too strict about attribute values when deriving the dominant class, which may cause the evaluation model to fail, while the single-threshold constrained tolerance dominance relation rough set is too loose about the number of attributes, which may make the evaluation results inconsistent with human cognitive judgment. To address these problems, a rough evaluation model based on a dual-threshold constrained tolerance dominance relation was proposed. Firstly, the concept of the dual-threshold constrained tolerance dominance relation was defined and its relevant properties were studied. Then, based on this extended dominance relation, the dominance degree was defined and a rough evaluation model was built using statistical analysis. Finally, the model was applied to the comprehensive strength evaluation of the regional building industry, and the ranking results were compared with those obtained by the classical dominance relation rough set. The results show that the proposed model is more rational and efficient for multi-attribute decision problems.
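A schematic Python reading of a dual-threshold tolerance dominance check, assuming one threshold bounds the tolerated per-attribute shortfall and the other bounds the fraction of attributes that must satisfy dominance; this is an illustrative interpretation, not the paper's formal definition.

```python
import numpy as np

def tolerance_dominates(x, y, value_tol=0.05, attr_ratio=0.8):
    """Return True if x dominates y under the dual thresholds: an attribute
    counts if x falls short of y by at most value_tol, and at least
    attr_ratio of all attributes must count."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    satisfied = x >= y - value_tol               # per-attribute tolerant dominance
    return satisfied.mean() >= attr_ratio

def dominant_class(obj_index, data, **kw):
    """Objects that tolerance-dominate the given object."""
    return [i for i, row in enumerate(data)
            if tolerance_dominates(row, data[obj_index], **kw)]

# usage on a small ordered information table (rows = alternatives)
data = np.array([[0.90, 0.80, 0.70],
                 [0.85, 0.82, 0.72],
                 [0.50, 0.40, 0.90]])
print(dominant_class(0, data))   # -> [0, 1]
```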
Novel K-medoids clustering algorithm based on breadth-first search
YAN Hongwen, ZHOU Yamei, PAN Chu
Journal of Computer Applications    2015, 35 (5): 1302-1305.   DOI: 10.11772/j.issn.1001-9081.2015.05.1302

To overcome disadvantages of the traditional K-medoids clustering algorithm, such as sensitivity to the initial selection of centers, random center selection and poor accuracy, a breadth-first search strategy for centers was proposed on the basis of effective initialization by granular computing. Firstly, the new algorithm selected K granules using granular computing and took their corresponding centers as the K initial centers. Secondly, according to the similarity between objects, the algorithm built a binary tree of similar objects for each initial center, with that center as the root node, and then traversed each binary tree by breadth-first search to find the K optimal centers. In addition, the fitness function was optimized by using the within-cluster distance and the between-cluster distance. The experimental results on the standard UCI datasets Iris and Wine show that the proposed algorithm effectively reduces the number of iterations while guaranteeing clustering accuracy.
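A compressed sketch of the center-refinement idea for one cluster: members are ordered by similarity to the initial center, arranged implicitly as a binary tree, and a breadth-first traversal keeps the member with the lowest within-cluster cost. The tree construction and cost function here are simplified assumptions, not the paper's exact procedure.

```python
import numpy as np
from collections import deque

def refine_medoid(cluster, init_center):
    """Breadth-first search over a binary tree of cluster members ordered by
    similarity to the initial center; return the member with the lowest
    total within-cluster distance."""
    order = np.argsort(np.linalg.norm(cluster - init_center, axis=1))  # root = most similar
    best_idx, best_cost = None, np.inf
    queue = deque([0])
    while queue:
        pos = queue.popleft()
        if pos >= len(order):
            continue
        idx = order[pos]
        cost = np.linalg.norm(cluster - cluster[idx], axis=1).sum()
        if cost < best_cost:
            best_idx, best_cost = idx, cost
        queue.extend((2 * pos + 1, 2 * pos + 2))   # left/right child in the ordering
    return cluster[best_idx]

# usage: new_center = refine_medoid(points_in_cluster, granular_initial_center)
```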

Research on the data grid technique for earthquake disaster alleviation scientific computing grid system
YAN Hong-mei, ZHAN Shou-yi, YANG Fang-ting
Journal of Computer Applications    2005, 25 (09): 2182-2184.   DOI: 10.3724/SP.J.1087.2005.02182
Locating, accessing and retrieving data efficiently is a main problem to be solved in wide area networks, and the data grid is an efficient way to solve it. A virtual metadata model was used to provide a unified data access interface, and this interface was used to build the Metadata Directory Service. Finally, the design was applied to build the data management of the Earthquake Disaster Alleviation Simulation Grid System, achieving efficient management of data resources.
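A toy sketch of the unified-access idea: a metadata directory service maps logical dataset names to physical replica locations, and clients resolve and open data through one interface regardless of where the data lives. All class names, methods and URIs below are illustrative assumptions, not the system's actual API.

```python
class MetadataDirectoryService:
    """Toy metadata catalogue: logical dataset name -> physical replica URIs."""
    def __init__(self):
        self._catalogue = {}

    def register(self, logical_name, replica_uri, attributes=None):
        entry = self._catalogue.setdefault(logical_name, {"replicas": [], "attrs": {}})
        entry["replicas"].append(replica_uri)
        entry["attrs"].update(attributes or {})

    def resolve(self, logical_name):
        return self._catalogue[logical_name]["replicas"]

class DataAccessInterface:
    """Unified access point: callers use logical names only."""
    def __init__(self, directory):
        self.directory = directory

    def open(self, logical_name):
        replicas = self.directory.resolve(logical_name)
        return replicas[0]       # pick the first replica; a real system would rank them

# usage
mds = MetadataDirectoryService()
mds.register("earthquake/simulation-run-01/waveforms",
             "gridftp://storage-node/data/waveforms.h5", {"format": "HDF5"})
print(DataAccessInterface(mds).open("earthquake/simulation-run-01/waveforms"))
```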